molecules by contacting a template nucleic acid with
that provide for the formation of a complementary template, and reacting the bifunctional chemical entities to
Description

Technical Field of the Invention 



[0001] The present invention relates to compounds comprising a bifunctional chemical entity which, during formation of an encoded molecule, is positioned in a predominate direction. The positioning of the bifunctional chemical entity in a preferred direction may entail, among other things, [...]. The invention also relates to nucleotide analogues that are substrates for a polymerase and to a double stranded nucleic acid comprising the compounds of the invention.

Background of the Invention 

[0002] Biological systems allow for the synthesis of α-peptides by a process known as translation. The natural translation process requires an mRNA template and a plurality of charged tRNA building blocks. A ribosome is initially attached to the mRNA template and directs the recognition between anti-codons of tRNA building blocks and complementing codons of the mRNA template. Concurrently with the recognition process, an α-amino acid residue of the tRNA is reacted with a nascent polypeptide to extend the nascent polypeptide with a monomer unit.

[0003] Recently, a method has been suggested in WO 02/103008 A2 (the content of which is incorporated herein in its entirety by reference) for producing encoded molecules other than α-peptides. According to one embodiment of this publication, a nucleic acid template and a plurality of nucleoside derivative building blocks carrying a functional entity are provided. Using the nucleic acid template, complementing nucleoside derivatives are incorporated into a complementary strand. Subsequent to, or simultaneously with, the formation of the complementing template, the functional entities are reacted to form an encoded molecule. The end product of the process is a bifunctional complex comprising an encoded molecule attached to the complementing template.

[0004] The present invention relates to certain aspects of the above publication, especially aspects in which a bifunctional chemical entity is attached to a nucleoside derivative, as a special situation arises for bifunctional chemical entities due to a potential free rotation around the linker-nucleotide bond. As illustrated in Fig. 1, a bifunctional chemical entity bears two different reactive groups 'X' and 'Y' [...], where 'X' on one chemical entity is meant to react with 'Y' on the neighbouring chemical entity, either directly or through a cross-linking agent. If all linker-chemical entity units orient identically with respect to the parent nucleotide, directional polymerization will take place and a complete product of, say, 5 units will be formed. However, rotation around the bond of some, but not all, linker-chemical entities, so that the relative orientation of the two functionalities differs, leads to a clustering situation, where the reacting groups are arranged so that reaction can take place in two different directions. This unfavourable situation can be avoided by using fixed functional entities, thereby preventing rotation around the nucleotide-linker bond. Fixing the chemical entities may be obtained by attaching the chemical entity not by one but by two covalent bonds (i.e. two linkers) to the nucleotide. The additional bond may be formed directly by one of the functionalities, or the two reactive groups may be attached by separate 'arms' on a fixed backbone. In the first situation the additional bond may be broken during the reaction, whereas the additional bond in the latter should be constructed so that this bond is also cleavable after reaction, to release the final product.

[0005] The present invention aims at providing compounds which tend to be directed in a predominate direction so that cluster formation is avoided. In general, directional encoding will lead to full-length products and, in addition, to products of known polarity. Thus an unambiguous relationship between the genetic information of the template and the encoded molecule may be obtained.

Summary of the Invention 

[0006] The present invention concerns a compound comprising a bifunctional chemical entity attached to a nucleoside derivative and a directing element capable of positioning the bifunctional chemical entity in a predominate direction.

[0007] DNA made up of a double helix has an inherent twist of the backbone, positioning one base pair not on top of the previous but shifted by a certain distance and in turn rotated by 36 degrees. This means that the distance between neighbouring attachment points is larger (typically on the order of 4.4 Å) than the vertical distance between two base pairs (3.4 Å). In addition, the rotation results in different starting directions of the linkers, resulting in increasing distance between equal atoms of neighbouring stiff linkers.
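The geometric relationship described above can be illustrated with a short calculation. The following is a minimal sketch, assuming an idealised helix in which equivalent attachment points lie on a cylinder; the rise (3.4 Å) and twist (36 degrees) are taken from the text, whereas the radius of about 4.5 Å is an assumed value chosen only because it reproduces a distance on the order of 4.4 Å.

```python
import math

RISE = 3.4    # Å, vertical distance between two base pairs (from the text)
TWIST = 36.0  # degrees of rotation per base pair (from the text)

def neighbour_distance(radius, rise=RISE, twist_deg=TWIST):
    """Distance between equivalent points on adjacent base pairs of an ideal helix.

    `radius` is the assumed distance of the attachment point from the helix axis.
    """
    # Horizontal separation is the chord subtended by the twist angle.
    chord = 2.0 * radius * math.sin(math.radians(twist_deg) / 2.0)
    # Combine horizontal chord and vertical rise.
    return math.hypot(chord, rise)

# Radii are illustrative assumptions, not values from the source.
for r in (0.0, 4.5, 9.0):
    print(f"radius {r:4.1f} Å -> neighbour distance {neighbour_distance(r):.2f} Å")
```

On the helix axis the distance equals the rise; it grows with the distance from the axis, which is consistent with the statement that equal atoms of neighbouring stiff linkers move further apart along the linker.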
[0008] For bifunctional entities attached to a nucleotide, there are two possibilities of reaction: the functionality placed in the 3' direction of the nucleic acid strand can attack a neighbouring functionality positioned in the 5' direction of the nucleic acid strand, or the functionality placed in the 5' direction can attack the neighbouring functionality positioned in the 3' direction of the nucleic acid strand. Due to the twist of DNA these two reactions are geometrically different, and the reaction distance of the 3'-5' reaction is significantly shorter than the reaction distance of the 5'-3' reaction. The present invention takes advantage of the geometry of the DNA double helix, whereby directionality can be obtained without covalently constraining the linkers. By incorporating a directing element in the vicinity of the functional entity, the bifunctional chemical entities will be influenced by the DNA environment. By careful design of linkers this can be utilised for directionality purposes.

[0009] Generally, the directing element interacts with one or more major groove atoms or an internucleoside linkage of a double helix nucleic acid. The interaction may be of any appropriate nature, e.g. the directing element is attracted or repulsed by major groove or internucleoside linkage atoms of a nucleic acid double helix. Importantly, the directing element must interact in a way that positions the bifunctional chemical entity in a certain preferred position. The attraction and/or repulsion may involve interaction between single atoms or between groups of atoms. The interaction between single atoms or groups of atoms may be an ionic attraction or repulsion. The interaction between groups of atoms may involve hydrophobic/hydrophilic interaction, van der Waals interaction etc. By preferred or predominate orientation or direction of the bifunctional chemical entity is meant that the bifunctional chemical entity is prone to be positioned in a certain direction in more than 50%, preferably 70% and most preferred 90% of the time. The time a bifunctional entity is in a certain direction may be calculated by evaluating the probability of each of the possible conformations. In general, the term conformation refers to individual structural orientations differing by simple rotation about single bonds. Different conformations may in addition give rise to different overall configurations, by which is meant an overall arrangement of bifunctional chemical entities on all modified nucleosides that gives rise to one specific direction of reaction. As an example, four bifunctional chemical entities arranged with all 'X's in the same direction correspond to one specific configuration, and four bifunctional chemical entities arranged e.g. with two 'X's pointing in one direction and the two others in the opposite direction correspond to another specific configuration. Within one configuration many different conformations are possible, but all of these result in the same 'most probable' product since the overall orientation (direction) of reactive groups is preserved.
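One way of evaluating how much of the time an entity spends in a given direction is to weight each conformation by its energy. The sketch below assumes Boltzmann weighting at room temperature, which the text does not prescribe; the function name, the energies and the direction labels are hypothetical and serve only as an illustration.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)

def direction_populations(conformers, temperature=298.15):
    """Return the fraction of time spent in each direction.

    conformers: list of (energy in kJ/mol, direction label) tuples.
    Boltzmann weighting is an assumption of this sketch.
    """
    e_min = min(energy for energy, _ in conformers)
    weights = [(math.exp(-(energy - e_min) / (R * temperature)), label)
               for energy, label in conformers]
    total = sum(w for w, _ in weights)
    populations = {}
    for w, label in weights:
        populations[label] = populations.get(label, 0.0) + w / total
    return populations

# Made-up energies for three conformations of one linker-entity unit.
example = [(0.0, "3'"), (2.0, "3'"), (5.0, "5'")]
pops = direction_populations(example)
for label, fraction in pops.items():
    print(f"{label}: {100 * fraction:.0f}% of the time")
```

A direction would count as predominate in the sense used above when its summed fraction exceeds 0.5.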

[0010] The directing element may be chosen among a variety of different chemical components. In an aspect of the invention the directing element comprises an optionally substituted 5-, 6-, or 7-membered ring structure. In a preferred aspect the directing element comprises an aromatic or hetero-aromatic ring system. In order to obtain a sufficiently high influence of the directing element on the bifunctional entity, it may be desired to position the bifunctional chemical entity and the directing element within a certain distance. In a preferred aspect of the invention, the extended length between the directing element and the bifunctional chemical entity is 4 Å or less. The term extended length means herein the conformation that leads to the longest distance.

[0011] The directing element can be positioned relative to the bifunctional chemical entity and the nucleoside derivative in any appropriate way. In one aspect of the invention, the directing element is positioned in the linkage between the nucleoside derivative and the bifunctional chemical entity. In another embodiment the bifunctional chemical entity is positioned between a linkage to the nucleoside derivative and the directing element. In a still further aspect of the invention, the directing element is attached to a linkage connecting the bifunctional chemical entity and the nucleoside derivative.

[0012] The functionalities of the bifunctional chemical entities may be chosen within a wide range of reactive groups. In one aspect of the invention, the two functionalities are capable of reacting with each other. Notably, the bifunctional chemical entity comprises a nucleophile and an electrophile as the two functionalities. An example of a nucleophile is an amine and an example of an electrophile is a carboxylic acid or an ester. In an aspect of the invention, the functionalities are protected by suitable protection groups.

[0013] The compound of the invention may be composed solely of the nucleoside derivative, the directing element and the bifunctional chemical entity. However, in some aspects the bifunctional chemical entity is attached to the nucleoside derivative through a spacing element. The spacing element serves a variety of functions when present, the main function being distancing the bifunctional chemical entity from the nucleoside derivative. In one aspect, the spacing element comprises a chemical bond which includes electrons from overlapping p-orbitals. Suitably, the spacing element includes a triple bond, an aromatic or heteroaromatic ring system, or a hetero atom.

[0014] The distance between the bifunctional chemical entity and the nucleoside derivative is chosen such that a suitable reactivity is obtained. In certain aspects, the extended length between the bifunctional chemical entity and the nucleoside is between 3 and 12 Å.

[0015] The bifunctional chemical entity can be attached to any position of the nucleotide derivative. In a preferred aspect the bifunctional chemical entity is attached through a linker to the nucleobase of the nucleoside. The nucleobase derivative may be a naturally occurring nucleobase or a synthetic nucleobase. In some aspects of the invention, the nucleobase is selected among adenine, 7-deaza-adenine, uracil, guanine, 7-deaza-guanine, thymine, and cytosine. In preferred aspects of the invention, the bifunctional chemical entity is attached through a linker to the 5 position of pyrimidine type nucleobases or the 7 or 8 position of purine type nucleobases.

[0016] The invention also relates to nucleotide analogues comprising the above nucleotide derivative attached to a bifunctional molecule. The nucleotide analogue may be incorporated into a complementary strand using various means, e.g. chemical ligation or enzymatic polymerisation. Usually it is preferred to use a polymerase or a ligase to incorporate the nucleotide analogue. Therefore, the nucleotide analogue is preferably a substrate for a polymerase or a ligase. An example of a substrate for polymerases is a nucleotide triphosphate. The nucleotide is usually a mononucleotide triphosphate but may be an oligonucleotide triphosphate when using the teaching of WO 01/16366, the content of which is incorporated herein by reference. An example of a substrate for a ligase is an oligonucleotide monophosphate.

[0017] The invention is furthermore directed to a method for directing the structural orientation of a bifunctional chemical entity, wherein a nucleoside derivative comprising a bifunctional chemical entity further comprises a directing element capable of positioning the bifunctional chemical entity in a predominate direction. In a preferred aspect, the directing element interacts with the major groove atoms or internucleoside linkage of a double helix nucleic acid. More preferred, the directing element comprises an optionally substituted 5-, 6-, or 7-membered ring structure. In some aspects, the directing element comprises an aromatic or hetero-aromatic ring system.

[0018] The invention also relates to a method for obtaining an encoded molecule comprising contacting a template nucleic acid with one or a plurality of compounds according to the invention, under conditions that provide for the formation of a complementary template, and reacting the bifunctional chemical entities to form an encoded molecule.

[0019] The template comprises a sequence of nucleotides which is complemented by the compounds of the invention and incorporated into a complementary strand. When a polymerase is used for incorporation, a primer is usually initially annealed to the template to obtain a site for the polymerase to bind. Subsequently, each of a plurality of nucleotides is incorporated into a complementary strand by extending the primer. In one aspect of the invention, native nucleobases or close analogues are used in the compounds of the invention to obtain a conventional Watson-Crick base pairing scenario, i.e. an A forms a specific base-pair with T and C forms a specific base-pair with G. The specific base-pairing allows the genetic encoding of the bifunctional chemical entity in the final encoded molecule because it is possible to decode the template or alternatively the complementing template to establish the synthesis history of the encoded molecule.
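The decoding step can be sketched in a few lines. The example below assumes Watson-Crick pairing as stated above and a one-to-one assignment of bifunctional chemical entities to nucleobases; the labels BE1 to BE4 and both function names are hypothetical placeholders, not identifiers from the publication.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical assignment of one bifunctional chemical entity per nucleobase
# of the complementing strand.
ENTITY_FOR_BASE = {"A": "BE1", "T": "BE2", "C": "BE3", "G": "BE4"}

def complementing_strand(template):
    """Return the complementing strand read 5'->3' for a template given 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))

def synthesis_history(template):
    """List the entities carried by the complementing strand, in 5'->3' order."""
    return [ENTITY_FOR_BASE[base] for base in complementing_strand(template)]

print(complementing_strand("GCTTA"))
print(synthesis_history("GCTTA"))
```

Because the pairing is specific, either strand suffices to recover the order of entities, which is the sense in which the template records the synthesis history.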

[0020] In an aspect of the invention a plurality of templates is provided to produce a library of encoded molecules. The library of encoded molecules has a variety of uses, e.g. as possible ligands to a pharmaceutically interesting target or as a vehicle for carrying a pharmaceutically active substance into a tissue or cell of interest.

[0021] The encoded molecule is generally a polymer in the sense that a head-to-tail reaction of the bifunctional chemical entities occurs. It is to be understood that the units of the polymer may be identical or different and the type of reaction may vary over the encoded molecule. Using a single nucleobase in the compound of the invention allows for four different bifunctional chemical entities, if a one-to-one relationship between the genetic information of the nucleobase and the identity of the bifunctional chemical entity is to be maintained. However, using two or more nucleobases in the compound of the invention allows for the formation of a more diverse monomer composition.
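The gain in diversity can be made concrete by counting the distinct codes available. This is a minimal sketch, assuming the four natural nucleobases and that every combination of nucleobases is used as a separate code; the function name is illustrative only.

```python
from itertools import product

BASES = "ACGT"  # assumption: the four natural DNA nucleobases

def codons(n):
    """Enumerate every code of n nucleobases; each can encode one entity."""
    return ["".join(p) for p in product(BASES, repeat=n)]

for n in (1, 2, 3):
    print(f"{n} nucleobase(s) per compound: {len(codons(n))} possible entities")
```

With one nucleobase there are four codes, in line with the text; each additional nucleobase multiplies the number by four.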

[0022] The invention is also directed to a double stranded nucleic acid having one or more compounds of the invention incorporated therein. In a preferred aspect the bifunctional chemical entities have been reacted under conditions in which they had a predominate orientation. After the reaction or simultaneously with the reaction, one or more linking moieties may be cleaved to efficiently display the encoded molecule.

Brief Description of the Figures

[0023]

Fig. 1 shows a situation in which a cluster is formed due to lack of directionality of the bifunctional chemical entity.
Fig. 2 discloses a general composition of the compound of the invention.
Fig. 3 shows a design of the linker in which the directing element is between a spacing element and a bifunctional entity.
Fig. 4 discloses a design of the linker in which the directing element is attached to the spacing element.
Fig. 5 shows a design of the compound in which the bifunctional chemical entity is between the spacing element and the directing element.
Fig. 6 discloses a predominate position of a bifunctional entity of Example 1 relative to the DNA double helix.
Fig. 7 shows a diagram of reaction distances as function of conformation energies for the compound of Example 1.
Fig. 8 depicts two conformations of the compound of Example 2 when incorporated into a DNA double helix.
Fig. 9 discloses a diagram of reaction distance as function of conformation energies for the compound of Example 2.

Detailed Description of the Invention 

[0024] In the discussion of the present invention it is convenient to introduce the term linker, as a residue or chemical bond separating the bifunctional chemical entity (BE) and the nucleoside derivative (ND), see Fig. 2. The linker may or may not include a spacing element and/or the directing element. In a preferred aspect the linker is of a specified length and comprised of two elements: a spacing element (SE) and a directing element (DE), thereby ensuring reaction in the vicinity of the double-stranded DNA (comprised of the template and the linked complementing units), providing a specific overall orientation of bifunctional units and thereby a high degree of directionality of polymerisation.

EP 1 533 385 A1

[0025] The directing element can link the SE and the bifunctional chemical entity (Fig. 3), the DE can be attached to the SE as a substituent so that the SE directly links the nucleoside derivative and the BE (Fig. 4), or alternatively, the DE can be attached at the outer side of the BE (Fig. 5).

[0026] The purpose of the spacing element is to ensure some distance between the BE and the nucleotide derivative, typically no more than a favourable van der Waals distance just excluding water intrusion between the DNA and the bifunctional chemical entity. The SE is preferably comprised of a chemical unit bearing π-electrons, i.e. a triple bond, an aromatic or heteroaromatic ring system, or a heteroatom such as O, S, or N. The minimum length of the spacing element is 0 atoms, i.e. the spacing element is absent. When present, the length of the spacing element may be any appropriate number of atoms. Usually, the number of atoms is 6 or less.

[0027] The purpose of the directing element is to position concurrent bifunctional chemical entities in a consistent manner, e.g. either all horizontally towards the major groove or all vertically towards the major groove. This is achieved through obtaining favourable interactions between major groove atoms of DNA and atoms of the DE as well as between the concurrent DEs. In addition, by use of various substituents, steric hindrance of one but not the other reaction can be achieved. The DE is preferably comprised of a substituted or unsubstituted 5-, 6-, or 7-membered aromatic/heteroaromatic ring system, characterised by potentially having a stacking effect. Counting the shortest distance between SE and BE attachment points, the DE usually has a minimum length of 3 atoms and a maximum length of 6 atoms. In cases where the DE is attached as a substituent at the SE or at the BE, the typical length is 3 to 6 atoms plus additional atoms from potential ring substituents. If the DE is attached at the outer end of the BE, the link should be through a specifically cleavable traceless construction, typically an ortho-nitrobenzyl unit attached to an amine.

[0028] The total linker length (i.e. without the size of the bifunctional entity) is in general from 3 atoms and up to 11 atoms. Combining the different elements in the most efficient way results in a total length (including the bifunctional entity) of between 8 and 15 atoms, resulting in a typical extended length of 8 to 16 Å, and a typical non-extended length of 8-12 Å.

[0029] This length is comparable to the dimensions of the DNA double helix, which has a total diameter of approximately 18 Å, a cross width of the major groove of approximately 17 Å, and a minimum distance between the linker-attachment of the base and the closest phosphate group of the opposite strand of 12.5 Å. Detailed examples are given in the attached examples.

Nucleotides 

[0030] The nucleotides used in the present invention may be linked together in a sequence of nucleotides, i.e. an oligonucleotide. Each nucleotide monomer is normally composed of two parts, namely a nucleobase moiety and a backbone. The backbone may in some cases be subdivided into a sugar moiety and an internucleoside linker.

[0031] The nucleobase moiety may be selected among naturally occurring nucleobases as well as non-naturally occurring nucleobases. Thus, "nucleobase" includes not only the known purine and pyrimidine hetero-cycles, but also heterocyclic analogues and tautomers thereof. Illustrative examples of nucleobases are adenine, guanine, thymine, cytosine, uracil, purine, xanthine, diaminopurine, 8-oxo-N6-methyladenine, 7-deazaxanthine, 7-deazaguanine, N4,N4-ethanocytosine, N6,N6-ethano-2,6-diaminopurine, 5-methylcytosine, 5-(C3-C6)-alkynylcytosine, 5-fluorouracil, 5-bromouracil, pseudoisocytosine, 2-hydroxy-5-methyl-4-triazolopyridine, isocytosine, isoguanine, inosine and the "non-naturally occurring" nucleobases described in Benner et al., U.S. Pat. No. 5,432,272. The term "nucleobase" is intended to cover these examples as well as analogues and tautomers thereof. Especially interesting nucleobases are adenine, guanine, thymine, cytosine, 5-methylcytosine, and uracil, which are considered as the naturally occurring nucleobases.
Examples of suitable specific pairs of nucleobases are shown below:

Natural Base Pairs

[0032] [Structural formulae: adenine paired with uracil (R=H) or thymine (R=CH3); guanine paired with cytosine]

Synthetic Base Pairs

[0033] [Structural formulae of synthetic base pairs]

Synthetic purine bases pairing with natural pyrimidines

[0034] [Structural formulae: 7-deaza adenine paired with uracil (R=H) or thymine (R=CH3); 7-deaza guanine paired with cytosine]

[0035] Suitable examples of backbone units are shown below (B denotes a nucleobase):

[Structural formulae of backbone units: DNA, Phosphorothioate, Oxy-LNA, 2'-O-Methyl, Thio-LNA, Amino-LNA (R = -H, -CH3), RNA, 2'-MOE, 2'-Fluoro, 2'F-ANA, HNA, CeNA, PNA, Morpholino, 2'-(3-hydroxy)propyl, 3'-Phosphoramidate, Boranophosphates, TNA]

[0036] The sugar moiety of the backbone is suitably a pentose but may be the appropriate part of a PNA or a six-membered ring. Suitable examples of possible pentoses include ribose, 2'-deoxyribose, 2'-O-methyl-ribose, 2'-fluoro-ribose, and 2'-4'-O-methylene-ribose (LNA). Suitably the nucleobase is attached to the 1' position of the pentose entity.

[0037] An internucleoside linker connects the 3' end of a preceding monomer to the 5' end of a succeeding monomer when the sugar moiety of the backbone is a pentose, like ribose or 2'-deoxyribose. The internucleoside linkage may be the naturally occurring phosphodiester linkage or a derivative thereof. Examples of such derivatives include phosphorothioate, methylphosphonate, phosphoramidate, phosphotriester, and phosphodithioate. Furthermore, the internucleoside linker can be any of a number of non-phosphorus-containing linkers known in the art.

[0038] Preferred nucleic acid monomers include naturally occurring nucleosides forming part of the DNA as well as the RNA family connected through phosphodiester linkages. The members of the DNA family include deoxyadenosine, deoxyguanosine, deoxythymidine, and deoxycytidine. The members of the RNA family include adenosine, guanosine, uridine, cytidine, and inosine. Inosine is a non-specific pairing nucleoside and may be used as a universal base because inosine can pair nearly isoenergetically with A, T, and C. Other compounds having the same ability of non-specifically base-pairing with natural nucleobases have been formed. Suitable compounds which may be utilized in the present invention include, among others, the compounds depicted below.
Examples of Universal Bases:

[0039] [Structural formulae of universal bases, including dP, dK and Nebularine]

Examples 

General 

[0040] Employing chemical entities having two different reacting groups capable of reacting with each other leads to the possibility of two different reactions: either the 3' nucleophile reacts with the 5' electrophile (3'-5' reaction) or the 5' nucleophile reacts with the 3' electrophile (5'-3' reaction). As explained above, the structure of the DNA double helix leads to a different geometric nature of the two reactions. By rational linker design this difference can be exploited, resulting in building blocks favouring one reaction directionality over the other, and thereby in a larger fraction of full-length product and a known polarity.

[0041] Computer calculations can provide a measure of the probability of the two reactions. The purpose of these 
calculations is to analyse various modes of attack for each linker-BE construction, estimating the most probable reaction 
and thereby the most probable product. Therefore, the conformational space covered by the linker-BE unit and the
zones occupied by the reactive groups need to be estimated.

[0042] The conformational space of a specific linker-BE system, i.e. the range of the BE, can be estimated by doing 
a conformational search. Conformational searches can be performed employing various different software products 
and within these programs using different searching methods, as is standard knowledge within the field. For systems 
of the size mentioned herein it is not possible to perform a converged conformational search, that is, to ensure that 
enough steps have been taken so that the complete potential energy surface has been covered and thereby that the 
located minimum energy conformation is truly the global minimum for the molecule. However, the purpose of these 
calculations is to get a picture of the space allowed to be covered by a linker-BE unit and thereby to estimate the most 
likely approach of attack between two BEs and the possibility for the reacting groups to get within reaction distance. 
Efficient conformational searching methods employing a rather limited number of steps fulfil this purpose. 
[0043] By conformations are here meant individual structural orientations differing by simple rotation about single 
bonds. Different conformations may in addition give rise to different overall configurations, by which is meant an overall 
arrangement of the two reactive groups on all modified nucleotides that give rise to one specific direction of reaction. 
[0044] Referring to Fig. 1, the four linker-BE units arranged with all 'X's' in the same direction corresponds to one
specific configuration, and four linker-BE units arranged e.g. with two 'X's' pointing in one direction and the two other 
in the opposite direction corresponds to another specific configuration. Within one configuration many different con- 
formations are possible, but all of these result in the same 'most probable' product since the overall orientation (direction) 
of reactive groups is preserved. 

[0045] The calculations performed in this investigation have employed the MacroModel 8.0 software from Schrödinger
Inc (MMOD80). Within this program package a series of different searching protocols are available, including the 'Mixed 
Monte Carlo Multiple Minimum/Low Mode' method (MCMM/LM), shown to be very effective in locating energy minima 
for large complicated systems. 

[0046] Having covered the conformational space to a reasonable extent, the structures can be analysed, as follows. 
It is important to note, that the distances mentioned in the following are reactant distances, i.e. minimum energy dis- 
tances, and will never be in the range of reaction (i.e. transition state) distances. The possibility therefore exists that 
the reactant distances for a specific attack seem favourable but that the two groups never can get closer than that, i.e.
are incapable of getting within realistic reaction distance. It is therefore necessary to analyse whether the minimum
energy structures by simple dihedral rotations will result in structures having the two reacting groups within reaction 
distance. 

[0047] For every conformation within a specified energetic interval (e.g. within 50 kJ/mol from the 'global minimum'), the 3'-5' and 5'-3' distances are measured in Å (d3' and d5', respectively) and the energy in kJ/mol. Subtracting the 5'-3' distance from the 3'-5' distance (d3'-d5') gives a measure of the absolute difference in distance between the two reactions. However, since only conformations where one of the two distances is short will lead to reaction, it is necessary to divide by min(d3',d5').



Example:

3' favourable:  d3' = 3 Å,   d5' = 7 Å    (d3'-d5')/min(d3',d5') = -1.33
5' favourable:  d3' = 7 Å,   d5' = 3 Å    (d3'-d5')/min(d3',d5') = 1.33
no reaction:    d3' = 10 Å,  d5' = 13 Å   (d3'-d5')/min(d3',d5') = -0.3



[0048] That is, conformations capable of giving 3' reaction will show large negative values, conformations leading to 5' reaction large positive values, and non-reactive conformations will be identified by small positive or negative values.

Plotting these preference values as a function of conformational energy, and in addition colour coding the bars according to min(d3',d5') such that the interesting conformations show in a distinctive colour, leads to a very easy visualization of the linker directionality efficiency.
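The preparation of such a plot can be sketched as follows. This is a minimal illustration, assuming the conformers are available as (energy, d3', d5') tuples; the function names and all numerical values are hypothetical, and the 50 kJ/mol window follows the example interval mentioned in the text.

```python
def preference(d3, d5):
    """(d3' - d5') / min(d3', d5'); negative favours 3', positive favours 5'."""
    return (d3 - d5) / min(d3, d5)

def plot_data(conformers, window=50.0):
    """Return (relative energy, preference value, min distance) per conformer.

    conformers: list of (energy in kJ/mol, d3' in Å, d5' in Å).
    Only conformers within `window` kJ/mol of the lowest energy are kept,
    sorted by increasing relative energy.
    """
    e_min = min(energy for energy, _, _ in conformers)
    rows = [(energy - e_min, preference(d3, d5), min(d3, d5))
            for energy, d3, d5 in conformers
            if energy - e_min <= window]
    return sorted(rows)

# Made-up conformers; the last one lies outside the energy window.
conformers = [(-120.0, 3.0, 7.0), (-110.0, 7.0, 3.0),
              (-100.0, 10.0, 13.0), (-60.0, 4.0, 9.0)]
rows = plot_data(conformers)
for rel_energy, pref, dmin in rows:
    print(f"{rel_energy:5.1f} kJ/mol  preference {pref:+.2f}  min distance {dmin:.1f} Å")
```

The third column is the quantity that would drive the colour coding of the bars, so that short-distance conformations stand out.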

[0049] A favourable linker design with a large preference for only one of the reactions will thus be shown as having 
either exclusively large negative or large positive values. Small-value conformations do not constitute a problem with 
respect to cluster formation but will be 'dead' conformations and as a worst case scenario lead to small overall reaction 
probability. In addition, since the relative weight of a conformation decreases with increasing conformation energy, 
more significance is placed on the lower energy conformations.

[0050] Examples of plots showing linkers leading to a) 3' preference and b) 5' preference are given in Fig. 7 and Fig. 9,
respectively. The colour code is grey-scale in these figures, where conformations showing high directionality are [...]

Another way of presenting the directionality of a bifunctional entity is by numbers describing the percentage of
conformations within a certain tendency towards a specified reaction direction. As an example of a threshold,
(d3'-d5')/min(d3',d5')<-0.75 can be taken as indicative of 3' directionality and (d3'-d5')/min(d3',d5')>0.75 as indicative
of 5' directionality. The directionality of the linker is then given by

%3'=#conf((d3'-d5')/min(d3',d5')<-0.75)/total # conf
%5'=#conf((d3'-d5')/min(d3',d5')>0.75)/total # conf

A favourable linker design will be shown by a large difference between the two numbers. In Figures 7 and 9 the
corresponding % directionality numbers are indicated.
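The percentage measure can be sketched in Python as follows. The function name `percent_directionality` and the four conformations are hypothetical. The factor of 100 is an assumption made so that the result reads as a percentage, as the reported numbers suggest; the 0.75 cut-off is the one named in the text.

```python
def percent_directionality(distances, threshold=0.75):
    """Return (%3', %5') for a list of (d3', d5') pairs in angstrom.

    A conformation counts as 3' directional when its preference value is
    below -threshold and as 5' directional when it is above +threshold.
    """
    if not distances:
        raise ValueError("at least one conformation is required")
    values = [(d3 - d5) / min(d3, d5) for d3, d5 in distances]
    n3 = sum(1 for v in values if v < -threshold)
    n5 = sum(1 for v in values if v > threshold)
    total = len(values)
    return 100.0 * n3 / total, 100.0 * n5 / total


# Hypothetical conformations, not taken from the source.
conformations = [(3.0, 7.0), (4.0, 8.0), (7.0, 3.0), (10.0, 13.0)]
pct3, pct5 = percent_directionality(conformations)
print(f"%3' = {pct3:.1f}, %5' = {pct5:.1f}")  # %3' = 50.0, %5' = 25.0
```

Conformations between the two thresholds count towards neither number, so %3' and %5' need not sum to 100.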



Example 1. Compound type ND-SE-DE-BE 
[0051] 




Computational details 

[0052] Double-stranded DNA with the base sequence 5'-GCTTT[...] was built using HyperChem7
from HyperCube Inc in the most frequent B-conformation. The linker-BE units were built using ChemDraw Ultra 6.0
and Chem3D Ultra 6.0 from ChemOffice. Linker-BE units and DNA were imported into MMOD80. The linkers were
then fused to the two mid nucleotides using the build feature in MMOD80, fusing the methyl carbon atom of the T
nucleotides with the appropriate linker atom, in effect creating a modified U nucleotide. In the calculation all DNA atoms
[...]
within the DNA strand. The total system (keeping the DNA atoms frozen) was energy minimised (CONV, arg5=0.05)
employing the OPLS-AA force field (FFLD, arg1=11) and the GB/SA water solvation model supplied in MMOD80.
Conformational analyses were performed using the MCMM/LM method (LMCS keyword), running 1000 steps
(arg1=1000), exploring a random linear combination of the first 10 modes (arg3=-10), and with a minimum and maximum
distance travelled by the fastest moving atom of 3 and 8 Å, respectively (arg7=3, arg8=8). [...]
in the two linker [...] Finally each conformer was minimized
by 500 PR Conjugate Gradient steps. The lowest energy conformation is displayed in Figure 6.



Results 



[0053] Running 1000 conformational search steps results in 393 unique conformations with the global minimum
located once. The lowest-energy conformation is displayed in Figure 6 and is clearly biased towards the 3' reaction
direction. The two linkers orient towards the DNA major groove in a regular way, providing stacking of the two directing
elements (benzene rings) and favourable van der Waals interactions between linker and DNA atoms. This orientation
results in a reaction distance for the 3' attack (amino group of the topmost bifunctional entity (3' end) reacting
with the ester group of the lower building block (5' end)) of 5.5 Å as compared to the distance for the 5' attack (amino
group of the lower building block (5' end) reacting with the ester group of the upper building block (3' end)) of 9.5 Å. It
is important to note here that these structures are minimum energy structures and thus the distances will never be in
[...] energetically to the displayed minimum structure, and a linker leading to a minimum structure
showing [...]
of leading to 3' directionality also at the transition structure level.

[0054] Figure 7 shows the directionality plot of the calculation, according to the reaction distance conversions
described in the text above. The plot shows a very clear tendency of this linker to favour a 3' reaction direction, since
only very few and high-energy conformations result in positive-valued bars. The corresponding %directionality numbers
are %3'=36.6 and %5'=2.9.

[0055] Thus, use of this linker provides large 3' directionality and in addition, since %3' is a significantly high number,
a very high reaction probability.

Example 2. Compound type ND-SE-DE-BE

[0056]

Computational details

[0057] Double-stranded DNA with the base sequence 5'-GCTTTTAG-3' (upper strand) was built using HyperChem7
from HyperCube Inc in the most frequent B-conformation. The linker-FE units were built using ChemDraw Ultra 6.0
and Chem3D Ultra 6.0 from ChemOffice. Linker-FE units and DNA were imported into MMOD80. The linkers were
then fused to the two mid nucleotides using the build feature in MMOD80, fusing the methyl carbon atom of the T
nucleotides with the appropriate linker atom, in effect creating a modified U nucleotide. In the calculation all DNA atoms
were kept frozen, that is, were not allowed to move, in order to decrease the size of the systems and to avoid distortions
within the DNA strand. The total system (keeping the DNA atoms frozen) was energy minimised (CONV, arg5=0.05)
employing the OPLS-AA force field (FFLD, arg1=11) and the GB/SA water solvation model supplied in MMOD80.
Conformational analyses were performed using the MCMM/LM method (LMCS keyword), running 1000 steps
(arg1=1000), exploring a random linear combination of the first 10 modes (arg3=-10), and with a minimum and maximum
distance travelled by the fastest moving atom of 3 and 8 Å, respectively (arg7=3, arg8=8). Extended cut-off distances
of 100, 100, and 4 for van der Waals, electrostatics, and hydrogen bonds, respectively, were used in all calculations. A
maximum of 13 torsions (all within the two linker parts of the total system) were allowed to be changed in a MC step
(MCNV, arg1=2, arg2=13) and the energy cut-off was 50 kJ/mol (MCSS, arg5=50.0). Finally each conformer was
minimized by 500 PR Conjugate Gradient steps. The lowest energy conformation is displayed in Figure 8 left, and a 5'
favourable configuration in Figure 8 right.

Results 

[0058] Running 1000 conformational search steps results in 566 unique conformations with the 'global' minimum
located once. Two low-energy conformations are displayed in Figure 8 left and right, conf 1 and 16 respectively.
Conformation 1 has a linker orientation presumably favouring 3' attack; however, the reaction distance is quite long (7.7 Å
as compared to 11.7 Å for 5' attack) and the reaction is unlikely. Conformation 16 shows a 5' biasing linker orientation
with a favourable 5' reaction distance (3.6 Å as compared to 8.9 Å for reaction from the 3' direction), well in the range of
being close to a reactive transition structure. An overall vertical arrangement for this linker gives a more favourable
stacking interaction than a horizontal arrangement and allows for a closer proximity of the linkers and the DNA major
groove.

[0059] Figure 9 shows the directionality plot of the calculation, according to the reaction distance conversions
described above. It can be seen that the first few low-energy conformations are on the borderline of giving rise to 3'
directed reactions (negative-valued bars). However, by far the majority of the subsequent conformations are biased
towards 5' attack (large positive-valued bars), and the overall appearance of the plot is clearly towards 5' attack. The
corresponding %directionality numbers are %3'=1.4 and %5'=21.6. Thus, use of the present linker provides large 5'
directionality and in addition, since %5' is a high number, a high reaction probability.
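As a worked check, the reaction distances reported for individual conformations in Examples 1 and 2 can be passed through the preference value and the 0.75 threshold described earlier. This is a minimal Python sketch; the function name `classify` and the label strings are hypothetical, and only the distances come from the text.

```python
THRESHOLD = 0.75  # cut-off named in the text for 3'/5' directionality


def classify(d3, d5, threshold=THRESHOLD):
    """Return the preference value and a label for one conformation."""
    value = (d3 - d5) / min(d3, d5)
    if value < -threshold:
        label = "3' directional"
    elif value > threshold:
        label = "5' directional"
    else:
        label = "below threshold"
    return value, label


# (d3', d5') in angstrom, as reported in Examples 1 and 2.
reported = {
    "Example 1, lowest-energy conformation": (5.5, 9.5),
    "Example 2, conformation 1": (7.7, 11.7),
    "Example 2, conformation 16": (8.9, 3.6),
}

for name, (d3, d5) in reported.items():
    value, label = classify(d3, d5)
    print(f"{name}: {value:+.2f} ({label})")
```

With these inputs conformation 16 of Example 2 comes out clearly 5' directional (about +1.47), while conformation 1 (about -0.52) falls short of the cut-off, in line with its description as borderline. The minimum structure of Example 1 gives about -0.73, just inside the threshold.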


Example 3. A summary of a series of linkers. 

[0060] As described in the above examples it is possible by computational analyses to determine the likelihood of a
specific linker construction being directional or not. In the table below results are listed for a series of different linkers,
demonstrating the broad application of the design rules set up in the present invention. All calculations have been
performed as described in Examples 1 and 2.

Example 4: Examples of general linker construction of the type ND-SE-DE-FE

[0061]



Linker + BE                    # SE atoms    # DE atoms    Total # atoms
[structure not reproduced]     0             3             8
[structure not reproduced]     2             3             10



EP 1 533 385 A1 




Example 5. Examples of general linker constructions of the type 

CE-SE-FE 
I 

DE 



[0062] 



Linker + BE                    # SE atoms    # DE atoms    # total atoms
[structure drawings and table rows not recoverable; one surviving cell value: 9]

Example 6. Examples of general linker constructions of the type ND-SE-BE-DE

[0063]



Linker + BE                    # SE atoms    # DE atoms    Total # atoms
[structure drawings and table rows not recoverable]

Claims

1. A compound comprising a bifunctional chemical entity attached to a nucleoside derivative and a directing element
capable of positioning the bifunctional chemical entity in a predominate direction.

2. The compound of claim 1, wherein the directing element interacts with one or more major groove atoms or an
internucleoside linkage of a double helix nucleic acid.

3. The compound of claim 1 or 2, wherein the directing element comprises an optionally substituted 5-, 6-, or
7-membered ring structure.

4. The compound of claim 3, wherein the directing element comprises an aromatic or hetero-aromatic ring system.

5. The compound of claim 1 to 4, wherein the bifunctional chemical entity comprises a nucleophile and an electrophile
as the two functionalities.

6. The compound wherein the functionalities are protected by suitable protection groups. 

7. The compound of claim 1 to 6, wherein the bifunctional chemical entity is attached to the nucleoside derivative
through a spacing element.

8. The compound of claim 7, wherein the spacing element comprises a chemical bond which includes electrons from 
overlapping p-orbitals. 

9. The compound of claim 7 or 8, wherein the spacing element includes a triple bond, an aromatic- or heteroaromatic
ring system, or a hetero atom.

10. The compound of any of the claims 1 to 9, wherein the extended length between the bifunctional chemical entity
and the nucleoside is between 3 and 12 Å.

11. The compound of any of the preceding claims, wherein the bifunctional chemical entity is attached through a linker
to the nucleobase of the nucleoside.

12. The compound of claim 11, wherein the nucleobase is selected among adenine, 7-deaza-adenine, uracil,
guanidine, 7-deaza-guanidine, thymine, and cytosine.

13. The compound according to any of the preceding claims, wherein the bifunctional chemical entity is attached
through a linker to the 5 position of pyrimidine type nucleobases or the 7 or 8 position of purine type nucleobases.

14. The compound according to any of the preceding claims, wherein the extended length between the directing
element and the bifunctional chemical entity is 4 Å or less.

15. The compound according to claim 14, wherein the directing element is attracted or repulsed by major groove or
internucleoside linkage atoms of a nucleic acid double helix.

16. A nucleotide analogue comprising the compound according to any of the claims 1 to 15. 

17. The nucleotide analogue of claim 16 being a substrate for a polymerase. 

18. An oligonucleotide analogue comprising the compound according to any of the claims 1 to 15.

19. The oligonucleotide according to claim 18, being a substrate for a ligase. 

20. A method for directing the structural orientation of a bifunctional chemical entity, wherein a nucleoside derivative
comprising a bifunctional chemical entity further comprises a directing element capable of positioning the
bifunctional chemical entity in a predominate direction.

21. The method of claim 20, wherein the directing element interacts with the major groove atoms or internucleoside
linkage of a double helix nucleic acid.

22. The method of claim 20 or 21, wherein the directing element comprises an optionally substituted 5-, 6-, or
7-membered ring structure.



23. The method of claims 20 to 22, wherein the directing element comprises an aromatic or hetero-aromatic ring 
system. 

24. A method for obtaining an encoded molecule comprising
Contacting a template nucleic acid with one or a plurality of compounds according to any of the claims 1 to 19,
under conditions that provide for the formation of a complementary template,
Reacting the bifunctional chemical entities to form an encoded molecule.

25. The method according to claim 24, wherein the template comprises an oligonucleotide and the conditions providing 
for the formation of a complementary template include a polymerase or a ligase. 

26. The method according to claim 24 or 25, wherein the compound is comprised in a nucleotide triphosphate. 

27. A polymer obtainable by the method according to any of the claims 24 to 26. 

28. A double stranded nucleic acid having one or more compounds according to any of the claims 1 to 18 incorporated
therein.

29. The double stranded nucleic acid according to claim 28, wherein the bifunctional chemical entities have been
reacted under conditions in which they had a predominate orientation.

30. The double stranded nucleic acid according to claim 28 or 29, wherein one or more linking moieties have been
cleaved to display the encoded molecule.
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Fig. 6 
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Fig. 8 
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